Email Formality in the Workplace: A Case Study on the Enron Corpus
Abstract
Email is an important means of communication in daily life, and it has become the subject of various NLP and social studies. In this paper, we focus on email formality and explore the factors that could affect the sender’s choice of formality. As a case study, we use the Enron email corpus to test how formality is affected by social distance, relative power, and the weight of imposition, as defined in Brown and Levinson’s model of politeness (1987). Our experiments show that their model largely holds in the Enron corpus. We believe that the methodology proposed in this paper can be applied to other social media domains and used to test other linguistic or social theories.
Similar Resources
Network Analysis with the Enron Email Corpus
We use the Enron email corpus to study relationships in a network by applying six different measures of centrality. Our results came out of an in-semester undergraduate research seminar. The Enron corpus is well suited to statistical analyses at all levels of undergraduate education. Through this article’s focus on centrality, students can explore the dependence of statistical models on initial...
Work Hard, Play Hard: Email Classification on the Avocado and Enron Corpora
In this paper, we present an empirical study of email classification into two main categories: “Business” and “Personal”. We train on the Enron email corpus, and test on the Enron and Avocado email corpora. We show that information from the email exchange networks improves the performance of classification. We represent the email exchange networks as social networks with graph structures. For th...
Enron Emails as Graph Data Corpus for Large-scale Graph Querying Experimentation
In this paper we describe the Enron email corpus in a graph/network data format. Nodes of the graph are emails connected with named entities (NE) extracted from the text, such as people, email addresses, and telephone numbers. Edges are links between NEs representing co-occurrence in the same email part, paragraph, sentence, or composite NE. The Enron Graph corpus contains a few million nodes and it is quite large corpu...
Introducing the Enron Corpus
A large set of email messages, the Enron corpus, was made public during the legal investigation concerning the Enron corporation. This dataset, along with a thorough explanation of its origin, is available at http://www-2.cs.cmu.edu/~enron/. This paper provides a brief introduction and analysis of the dataset. The raw Enron corpus contains 619,446 messages belonging to 158 users. We cleaned the...
Analyzing Gossip in Workplace Email
We spend a significant part of our lives chatting about other people. In other words, we all gossip. Although sometimes a contentious topic, various researchers have shown gossip to be fundamental to social life—from small groups to large, formal organizations. Adopting the Enron email dataset and natural language techniques, we present the first study of gossip in a large CMC corpus. We find t...